Evaluating Speech-Driven IR in the NTCIR-3 Web Retrieval Task

نویسندگان

  • Atsushi Fujii
  • Katunobu Itou
چکیده

Speech recognition has of late become a practical technology for real world applications. For the purpose of research and development in speech-driven retrieval, which facilitates retrieving information with spoken queries, we organized the speech-driven retrieval subtask in the NTCIR-3 Web retrieval task. Search topics for the Web retrieval main task were dictated by ten speakers and recorded as collections of spoken queries. We used those queries to evaluate the performance of our speech-driven retrieval system, where speech recognition and text retrieval modules were integrated. The text retrieval module, which is based on a probabilistic model, indexed only textual contents in documents (Web pages), but did not use HTML tags and hyperlink information in documents. Experimental results showed that a) the use of target documents for language modeling and b) enhancement of the vocabulary size in speech recognition were effective to improve the system performance.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Evaluating multiple LVCSR model combination in NTCIR-3 speech-driven web retrieval task

This paper studies speech-driven Web retrieval models which accepts spoken search topics (queries) in the NTCIR-3 Web retrieval task. The major focus of this paper is on improving speech recognition accuracy of spoken queries and then improving retrieval accuracy in speech-driven Web retrieval. We experimentally evaluate the techniques of combining outputs of multiple LVCSR models in recognitio...

متن کامل

Experiments on Web Retrieval Driven by Spontaneously Spoken Queries

Motivated to realize the speech-driven information retrieval systems that accept spontaneously spoken queries, we developed a method to collect such speech data derived from the pre-defined search topics that had been systematically constructed for IR research. In order to evaluate both our method and the performance of the document retrieval by using the spontaneously spoken queries, we took p...

متن کامل

Evaluating Speech-Driven Web Retrieval in the Third NTCIR Workshop

Speech recognition has of late become a practical technology for real world applications. For the purpose of research and development in speech-driven retrieval, which facilitates retrieving information with spoken queries, we organized the speech-driven retrieval subtask in the NTCIR-3 Web retrieval task. Search topics for the Web retrieval main task were dictated by ten speakers and were reco...

متن کامل

Collecting Spontaneously Spoken Queries for Information Retrieval

Motivated to realize the speech-driven information retrieval systems that accept spontaneously spoken queries, we developed a method to collect such speech data derived from the pre-defined search topics that had been systematically constructed for IR research. In order to evaluate both our method and the performance of the document retrieval by using the spontaneously spoken queries, we took p...

متن کامل

Keyword recognition and extraction by multiple-LVCSRs with 60, 000 words in speech-driven WEB retrieval task

This paper presents speech-driven Web retrieval models which accepts spoken search topics (queries) in the NTCIR-3 Web retrieval task. We experimentally evaluate the techniques of combining outputs of multiple LVCSR models with a language model(LM) with a 60,000 vocabulary size in recognition of spoken queries. As model combination techniques, we use the SVM learning. We show that the technique...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2002